126 PART 3 Getting Down and Dirty with Data

If you can’t find any transformation that makes your data look even approximately normal, then you have to analyze your data using nonparametric methods, which don’t assume that your data are normally distributed.

Summarizing grouped data with bars, boxes, and whiskers

Sometimes you want to show how a numerical variable differs from one group of participants to another. For example, suppose you want to compare blood levels of a certain cardiovascular enzyme among the cardiology patients at four different clinics: Clinic A, B, C, and D. Two types of graphs are commonly used for this purpose: bar charts and box-and-whiskers plots.

Bar charts

One simple way to display and compare the means of several groups of data is with a bar chart, like the one shown in Figure 9-7a. Here, the bar height for each group of patients equals the mean (or median, or geometric mean) value of the enzyme level for patients at the clinic represented by the bar. The bar chart becomes even more informative if you indicate the spread of values for each clinical sample by placing lines representing one SD above and below the tops of the bars, as shown in Figure 9-7b. These lines are commonly referred to as error bars, an unfortunate choice of words that can cause confusion, because the lines don’t indicate mistakes in the data. In this case, error refers to statistical error (described in Chapter 6).
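As a rough illustration of what goes into such a chart, the sketch below computes the bar height (the mean) and the two ends of the error bar (one SD above and below the mean) for each clinic. The enzyme values are invented purely for illustration, and the `summarize` helper is a hypothetical name, not something from the book.

```python
from statistics import mean, stdev

# Hypothetical enzyme levels (arbitrary units) for four clinics.
# These numbers are made up for illustration only.
enzyme_levels = {
    "Clinic A": [42.0, 45.5, 39.8, 50.1, 44.6],
    "Clinic B": [55.2, 49.9, 61.3, 58.0, 52.6],
    "Clinic C": [38.4, 36.9, 41.2, 35.5, 40.0],
    "Clinic D": [47.7, 62.4, 44.1, 53.9, 49.4],
}

def summarize(groups):
    """Return {group name: (mean, SD)}, using the sample SD (n - 1 denominator)."""
    return {name: (mean(values), stdev(values)) for name, values in groups.items()}

summary = summarize(enzyme_levels)

for name, (m, sd) in summary.items():
    # The bar height is the mean; the error bar runs from mean - SD to mean + SD.
    print(f"{name}: bar height = {m:.1f}, error bar from {m - sd:.1f} to {m + sd:.1f}")
```

If you use a plotting library, these are the numbers you would hand to it. In matplotlib, for example, the means would go to the `height` argument of `bar` and the SDs to its `yerr` argument.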

But even with error bars, a bar chart still doesn’t provide a picture of the distribution of enzyme levels within each group. Are the values skewed? Are there outliers? You could make a histogram for each subgroup of patients (Clinic A, Clinic B, Clinic C, and Clinic D), but four histograms would take up a lot of space. There is a solution for this! Keep reading to find out what it is.
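To see why a mean and an SD alone can hide the shape of a distribution, consider this minimal sketch. The values are hypothetical, invented for illustration, with one deliberate outlier at the end of the list.

```python
from statistics import mean, median, stdev

# Hypothetical enzyme levels for a single clinic (made up for illustration).
# The last value is a deliberate outlier.
levels = [10, 11, 12, 12, 13, 14, 40]

m, sd = mean(levels), stdev(levels)

print(f"mean = {m:.1f}, SD = {sd:.1f}")             # all that the bar and error bar show
print(f"median = {median(levels)}")                 # well below the mean: right skew
print(f"min = {min(levels)}, max = {max(levels)}")  # the maximum is far from the rest
print(f"lower error bar end = {m - sd:.1f}")        # below the smallest observed value
```

Here the single large value pulls the mean above the median and inflates the SD so much that the lower end of the error bar falls below every value actually observed. None of that is visible from the bar and its error bar alone.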

FIGURE 9-7: Bar charts showing mean values (a) and standard deviations (b).

© John Wiley & Sons, Inc.